ABSTRACT

The Bayesian approach to setting passing scores, as
proposed by Swaminathan, Hambleton, and Algina, is compared with the
empirical Bayes approach to the same problem that is derived from
Huynh's decision-theoretic framework. Comparisons are based on
simulated data which follow an approximate beta-binomial distribution
and on real test results from the Comprehensive Tests of Basic Skills
administered in the South Carolina Statewide Testing Program. Both
procedures lead to setting identical or almost identical passing
scores as long as the test score distribution is reasonably symmetric
or when the minimum mastery level or criterion level is high. Larger
discrepancies tend to occur when this level is low, especially when
the distribution of test scores is concentrated at a few extreme
scores or when the frequencies are irregular. However, in terms of
mastery/nonmastery decisions, the two procedures result in the same
classifications in practically all situations. The empirical Bayes
procedure may be used for tests of any length, while the Bayesian
procedure is recommended only for tests of eight or more items.
Further, the empirical Bayes procedure can be generalized and applied
to more complex testing situations with less difficulty than the
Bayesian procedure. (Author/CP)
ABSTRACT

The Bayesian approach to setting passing scores as proposed by
Swaminathan, Hambleton, and Algina is compared with the empirical
Bayes approach to the same problem that is derived from Huynh's
decision-theoretic framework. Comparisons are based on simulated
data which follow an approximate beta-binomial distribution and on
real test data sampled from a statewide testing program. It is
found that the two procedures lead to setting identical or almost
identical passing scores as long as the test score distribution is
reasonably symmetric or when the minimum mastery level or criterion
level is high. Larger discrepancies tend to occur when this level
is low, especially when the distribution of test scores is
concentrated at a few extreme scores or when the frequencies are
irregular. However, in terms of mastery/nonmastery decisions, the two
procedures result in the same classifications in practically all
situations. Moreover, the empirical Bayes procedure may be used for
tests of any length, while the Bayesian procedure is recommended
only for tests of 8 or more items. Additionally, the empirical
Bayes procedure can be generalized and applied to more complex
testing situations with less difficulty than the Bayesian procedure.



1. INTRODUCTION

Among the many decision-theoretic approaches to setting passing
scores (or standards) for mastery tests, there are at least two
methods which rely on test data collected from a group of examinees.
The Bayesian procedure, as presented in Swaminathan, Hambleton, and
Algina (1975), assumes that prior knowledge regarding the examinees
is exchangeable (Novick, Lewis, & Jackson, 1973) and can be
quantified in some appropriate manner. On the other hand, the empirical
Bayes approach, as formulated in Huynh (1976a), uses only the true
ability distribution of the examinees and makes no assumption
regarding prior knowledge about the examinees. Both procedures use
test data collected from a group of examinees and establish passing
scores for mastery tests by minimizing certain loss functions. The
purpose of this paper is to present a comparison of the two sets of
standards (passing scores) formulated under a variety of conditions
which can be expected to be encountered in mastery testing or in
minimum competency testing. The comparison will be made first on
the basis of approximate beta-binomial test scores. Further
comparisons will be made using the Comprehensive Tests of Basic Skills
(CTBS, 1973) data collected in the 1978 South Carolina Statewide
Testing Program.

2. AN OVERVIEW OF THE BAYESIAN AND
EMPIRICAL BAYES APPROACHES

Overall Framework

The Bayesian framework as presented by Swaminathan et al. and
the special empirical Bayes procedure described in Huynh (1976a,
p. 70-73) start with a typical four-corner setup used in decision
theory. (See Figure 1, p. 16, for the basic elements of this setup.)
Let $\theta$ ($\pi$ in the notation of Swaminathan et al.) be the true score (or
true ability) of an examinee and $x$ be the observed test score as
obtained from an n-item test. For the binomial error model adopted
in both standard setting approaches, $\theta$ is the proportion of items
in a real or hypothetical item pool that an examinee answers
correctly. Let a person be called a master if that person's true
score $\theta$ is such that $\theta \ge \theta_0$ and a nonmaster if
$\theta < \theta_0$. Here, $\theta_0$ is a given constant which defines
the lower boundary of the mastery level or the criterion level. Since
a person's true score cannot be observed directly, decisions about
whether to call the person a master must be based on an observed test
score. What remains to be determined is the cutoff score $c$ that will
be in some sense optimal.

On the basis of the test score $x$, a person is called a master
if $x \ge c$ and a nonmaster if $x < c$. A correct decision is made
whenever either (a) $\theta \ge \theta_0$ and $x \ge c$, or
(b) $\theta < \theta_0$ and $x < c$. Otherwise, either a false positive
error ($\theta < \theta_0$ and $x \ge c$) or a false negative error
($\theta \ge \theta_0$ and $x < c$) is encountered.

In the case where the loss associated with each error is
constant, generality is not diminished if we let the loss incurred by
a false positive error be equal to 1 and that associated with a
false negative error be equal to $Q$. Here, $Q$ expresses the ratio of
the false negative error loss to the false positive error loss.
(In the notation of Swaminathan et al., $Q = l_{21}/l_{12}$.)
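
The four-corner setup and the constant losses may be illustrated with a short Python sketch. It is an illustration only: the function names `outcome` and `loss` are hypothetical and do not come from either of the papers cited, and the true score is treated as known purely for the sake of the example.

```python
def outcome(theta, x, theta0, c):
    """Classify one decision in the four-corner setup.

    theta  : true score (proportion of the item pool answered correctly)
    x      : observed test score
    theta0 : criterion (minimum mastery) level
    c      : cutoff (passing) score
    """
    is_master = theta >= theta0   # true status
    passes = x >= c               # decision based on the observed score
    if is_master == passes:
        return "correct"
    return "false positive" if passes else "false negative"


def loss(theta, x, theta0, c, Q):
    """Constant losses: 1 for a false positive, Q for a false negative."""
    table = {"correct": 0.0, "false positive": 1.0, "false negative": Q}
    return table[outcome(theta, x, theta0, c)]
```

For instance, an examinee with true score .70 who obtains 5 on a test with cutoff 7 and criterion .60 is a false negative and incurs the loss Q.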
Bayesian Approach

Now let an n-item test be given to m examinees. In the Bayesian
procedure as implemented by Swaminathan et al., the prior
information regarding the examinees is assumed to be exchangeable
(i.e., prior knowledge regarding one examinee can be interchanged
with that associated with another examinee without causing any
disturbance in the decision problem). The model requires knowledge
(prior belief) of the distribution of the variance of true scores
for the group. (In point of fact, an arcsine transformation of $\theta$
is used.) This prior distribution is taken to be the inverse
chi-square distribution with parameter $\lambda$ and degrees of freedom $\nu$. A
recommended choice of $\nu$ is 8 (Novick et al., 1973).
To assess $\lambda$, let $t$ be the number of test items which would
need to be administered to a typical examinee in order to obtain as
much information about that examinee's $\theta$ as we already have. Then,
$\lambda = 3/(2t+1)$. Wang (1973) has tables to facilitate computation in
this procedure. In the setup of the Wang tables, $\lambda/\nu$ is chosen as
.01, .02, .03, .04, and .05. These ratios correspond to the $t$
values of 18.25, 8.875, 5.75, 4.1875, and 3.25. Given the prior
information as revealed through $\lambda$ and $\nu$ and the test data of m subjects,
it is possible via the Wang tables to compute the two expected
losses: $\Pr(\theta < \theta_0 \mid \text{test data})$ and
$Q \cdot \Pr(\theta \ge \theta_0 \mid \text{test data})$ at
each test score. A Bayesian passing score is then the smallest
score at which the first expected loss is smaller than the second
one. More details may be found in Swaminathan et al. (1975) and
in Novick et al. (1973).
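
The correspondence between $t$ and the ratio $\lambda/\nu$, and the rule for reading off a Bayesian passing score, may be sketched as follows. This is a sketch under stated assumptions: $\nu = 8$ is taken as the default, the function names are hypothetical, and the posterior probabilities passed to `bayesian_passing_score` are assumed to have been obtained elsewhere (for example, from the Wang tables); the sketch does not compute them.

```python
def lam_from_t(t):
    """Prior parameter lambda for a prior worth t test items."""
    return 3.0 / (2.0 * t + 1.0)


def t_from_ratio(ratio, nu=8.0):
    """Invert lambda = 3/(2t+1) for a given ratio lambda/nu (nu = 8 assumed)."""
    return (3.0 / (ratio * nu) - 1.0) / 2.0


def bayesian_passing_score(p_below, Q):
    """Smallest score x at which Pr(theta < theta0 | data) is smaller than
    Q * Pr(theta >= theta0 | data).

    p_below[x] is the posterior probability Pr(theta < theta0 | data) at
    score x; these values are inputs here, not computed by this sketch.
    Returns None when no score satisfies the condition.
    """
    for x, p in enumerate(p_below):
        if p < Q * (1.0 - p):
            return x
    return None
```

With $\nu = 8$, `t_from_ratio` gives back the five $t$ values quoted above for the ratios .01 through .05.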

Empirical Bayes Approach

The empirical Bayes solution assumes that the m examinees
constitute a random sample from a population for which the true
ability $\theta$ follows a known distributional form such as the beta
density with parameters $\alpha$ and $\beta$ (Keats & Lord, 1962, page 68).
Sample test data are used to obtain the estimates $\hat{\alpha}$ and
$\hat{\beta}$, and the results are used to compute the probability of a
false positive decision, $\Pr(\theta < \theta_0,\ x \ge c)$, and of a
false negative decision, $Q \cdot \Pr(\theta \ge \theta_0,\ x < c)$, at a
given cutoff score $c$. The optimum passing score (henceforth referred
to simply as the passing score) will be the value of $c$ at which the
average loss, $\Pr(\theta < \theta_0,\ x \ge c) + Q \cdot \Pr(\theta \ge \theta_0,\ x < c)$,
is the smallest.

The procedure is implemented as follows. Let $\bar{x}$ and $s$ be the
mean and standard deviation of the test scores, and let the
Kuder-Richardson reliability coefficient be defined as

$$\alpha_{21} = \frac{n}{n-1}\left[1 - \frac{\bar{x}(n-\bar{x})}{n s^2}\right].$$

Then

$$\hat{\alpha} = \left(-1 + \frac{1}{\alpha_{21}}\right)\bar{x}$$

and

$$\hat{\beta} = -\hat{\alpha} + \frac{n}{\alpha_{21}} - n.$$

For test scores with insufficient variability, $\alpha_{21}$ may be negative.
If this occurs simply replace $\alpha_{21}$ by the smallest positive
reliability estimate which happens to be available. Let $I$ denote the
incomplete beta function as tabulated in Pearson (1934) and
implemented via computer programs such as the IBM Scientific Subroutine
Package (1971) or the IMSL (1977). Then the passing score is the
smallest integer $c$ at which

$$I(\hat{\alpha}+c,\; n+\hat{\beta}-c;\; \theta_0) \le \frac{Q}{1+Q}. \qquad (1)$$
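
The computation just described may be sketched in Python as follows. The sketch is an illustration under stated assumptions and not the program used by the writers. To stay within the standard library, the incomplete beta function is evaluated by a textbook continued-fraction expansion (a substitute for the tables and subroutine packages cited above; where SciPy is available, `scipy.special.betainc(a, b, x)` returns the same regularized function). The function names are hypothetical. The sketch raises an error for a nonpositive reliability instead of substituting another estimate, and it returns `None` when no score from 0 to n satisfies Inequation (1); both are conventions of the sketch.

```python
from math import exp, lgamma, log


def _betacf(a, b, x, max_iter=300, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))        # even step
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))  # odd step
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def inc_beta(a, b, x):
    """Regularized incomplete beta function I(a, b; x)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = exp(lgamma(a + b) - lgamma(a) - lgamma(b)
                + a * log(x) + b * log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def kr21(mean, sd, n):
    """Kuder-Richardson coefficient from the mean and S.D. of n-item scores."""
    return n / (n - 1.0) * (1.0 - mean * (n - mean) / (n * sd ** 2))


def beta_estimates(mean, sd, n):
    """Estimates of the beta parameters from the mean, S.D., and test length."""
    r = kr21(mean, sd, n)
    if r <= 0.0:
        # The text substitutes another positive estimate; this sketch stops.
        raise ValueError("nonpositive reliability; supply a positive estimate")
    alpha = (-1.0 + 1.0 / r) * mean
    beta = -alpha + n / r - n
    return alpha, beta


def passing_score(alpha, beta, n, theta0, Q):
    """Smallest integer c satisfying Inequation (1), or None if there is none."""
    limit = Q / (1.0 + Q)
    for c in range(n + 1):
        if inc_beta(alpha + c, n + beta - c, theta0) <= limit:
            return c
    return None
```

A typical call would be `passing_score(*beta_estimates(mean, sd, n), n, theta0, Q)`, with `mean` and `sd` computed from the observed score distribution.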
A normal approximation is available if there is a sufficiently
large number of items and if $\theta_0$ is not near 0 or 1. Let $\xi$ denote
the 100/(1+Q) percentile of the unit normal distribution. Then the
test passing score is nearly equal to

$$c = (n+\hat{\alpha}+\hat{\beta}-1)\theta_0 + \xi\sqrt{(n+\hat{\alpha}+\hat{\beta}-1)\,\theta_0(1-\theta_0)} - \hat{\alpha} + .5. \qquad (2)$$

The data presented in Huynh (1976b) indicate that the passing score
computed from Equation (2) does not differ appreciably from the one
deduced from Inequation (1) when the test consists of 20 items and
when $\theta_0$ is within the range from .50 to .80.
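
Equation (2) may be sketched with the Python standard library as follows. The function name is hypothetical, and the sketch returns the real-valued approximation without rounding it to an integer score, since the text does not state a rounding rule.

```python
from math import sqrt
from statistics import NormalDist


def approx_passing_score(alpha, beta, n, theta0, Q):
    """Normal approximation to the passing score, Equation (2)."""
    # xi is the 100/(1+Q) percentile of the unit normal distribution.
    xi = NormalDist().inv_cdf(1.0 / (1.0 + Q))
    big_n = n + alpha + beta - 1.0
    return (big_n * theta0
            + xi * sqrt(big_n * theta0 * (1.0 - theta0))
            - alpha + 0.5)
```

When Q = 1 the percentile is the median, so the second term vanishes and the approximation reduces to $(n+\hat{\alpha}+\hat{\beta}-1)\theta_0 - \hat{\alpha} + .5$.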



3. A COMPARISON OF BAYESIAN AND EMPIRICAL BAYES
PASSING SCORES FOR APPROXIMATE
BETA-BINOMIAL TEST DATA

The passing score obtained via the empirical Bayes approach,
as revealed by Inequation (1), is based on test score data that
follow a beta-binomial distribution. It may be of interest to
compare the Bayesian approach to setting a passing score with the
empirical Bayes approach, using test data which follow closely a
beta-binomial form.

Both the present comparison and the one detailed in the next
section are based on tests with ten items. In these comparisons,
the criterion or minimum mastery level is set at $\theta_0$ = .60, .70, and
.80. The loss ratio is chosen to be $Q$ = .25, .50, 1.00, and 2.00.
(Since $Q$ is the ratio of the false negative error loss to the false
positive error loss, a loss ratio smaller than one indicates that a
false negative error is less serious than a false positive error.)
To compute a passing score via the Bayesian approach, it is necessary
to specify the ratio $\lambda/\nu$ or, equivalently, the quantity $t$ as
described in Section 2. It may be recalled that $t$ may be interpreted
as the number of "test items" which are believed to be as informative
as the prior belief about the examinees. In practical situations
involving standard setting, it seems unreasonable to let the prior
belief carry as much weight as the objective test data. In other
words, it is unlikely that $t$ is too close to $n$. Thus for the
comparisons based on 10-item tests reported in this section and in
Section 4 as well as the comparisons based on 20-item tests
described in Section 5, the t-values are chosen to be 8.875
($\lambda/\nu$ = .02), 5.75 ($\lambda/\nu$ = .03), 4.1875 ($\lambda/\nu$ = .04),
and 3.25 ($\lambda/\nu$ = .05).

The first five test score frequency distributions (labeled A1
through A5 in Table 1) serve as the data base for the comparison of
the passing scores computed by the two procedures using test score
distributions that are approximately beta-binomial. Each is
deliberately chosen (i) to yield an $s_g^2$ value (variance of the
arcsine-square-root transformation of the test scores) conforming as
closely as possible to the tabulated $s_g^2$ values of the Wang tables
(so that no interpolation would be necessary) and (ii) to reflect
several degrees of skewness and variability thought to be typical of
mastery testing situations. (Also in Table 1, and explained below,
are distributions of actual test scores from the South Carolina
Statewide Testing Program.) It may be noted that, in Table 1, the
quantity D(%) represents the maximum percent difference between
the observed and beta-binomial-fitted cumulative frequencies. A
small D-value indicates a good fit.
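
The quantity D(%) may be sketched in Python as follows. Several points are assumptions of the sketch: D(%) is read here as the largest absolute gap between the observed and fitted cumulative relative frequencies, expressed in percent; the beta parameters `a` and `b` are taken as given and not fitted by the sketch; and the function names are hypothetical.

```python
from math import exp, lgamma


def betabinom_pmf(x, n, a, b):
    """Beta-binomial probability of score x on an n-item test."""
    log_p = (lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
             + lgamma(a + x) + lgamma(b + n - x) - lgamma(a + b + n)
             + lgamma(a + b) - lgamma(a) - lgamma(b))
    return exp(log_p)


def max_cum_diff_percent(freqs, a, b):
    """Largest gap (in percent) between observed and fitted cumulative
    relative frequencies; freqs[x] is the frequency at score x = 0..n."""
    n = len(freqs) - 1
    m = sum(freqs)
    observed = fitted = worst = 0.0
    for x, f in enumerate(freqs):
        observed += f / m
        fitted += betabinom_pmf(x, n, a, b)
        worst = max(worst, abs(observed - fitted))
    return 100.0 * worst
```

With `a = b = 1` the fitted distribution is uniform over the scores 0 to n, which gives a convenient check of the sketch.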

                                             TABLE 1

                           Frequency Distributions of Test Scores Used
                                in Comparisons of Passing Scores

                                                                    Frequency at score of
Set  Data source/subtest                     m  D(%)  S.D.  Skewness   0   1   2   3   4   5   6   7   8   9  10

Approximate Beta-Binomial
A1   Fictitious                             40   3.1  1.36     -0.61                       1   3   6   8  11  11
A2   Fictitious                             80   1.0  1.87     -0.51           1   3   6  10  13  16  15  11   5
A3   Fictitious                             40   1.2  1.01     -1.51                           1   2   4  10  23
A4   Fictitious                             40   1.6  2.01     -0.02       1   3   5   6   7   7   5   4   2   0
A5   Fictitious                             40   1.0  2.15      0.12   1   3   5   6   7   6   5   4   2   1   0

Comprehensive Tests of Basic Skills
B1   Mathematics concepts and applications  20   ...  1.28     -0.63                           2   1   6   4   7
B2   Mathematics computations               20   9.2  1.45     -0.24                           3   4   3   4   6
B3   Spelling                               20   6.1  1.76     -1.04                   2   0   1   2   6   4   5
B4   Social studies                         40   6.2  2.11      0.27       1   4   5   9   5   5   6   3   1   1
B5   Language expression                    40   8.2  1.86     -0.53           1   1   5   3   4  11  10   3   2
B6   Reading                                40   4.1  1.22     -2.12                       1   1   2   3   3  30
B7   Science                                60   5.6  1.74     -0.22               2   6  10   8  14   8  12   0
B8   Reading vocabulary                     60   3.2  1.56     -1.75               1   0   3   1   5   5  16  29
B9   Reading vocabulary                     80   2.7  1.68     -1.49               2   1   2   5   6  11  23  30
B10  Spelling                               80   2.1  1.50     -1.44               1   0   2   4   7  12  16  38

m is the total number of scores in the distribution.
D(%) represents the maximum percent difference between the observed
and beta-binomial-fitted cumulative frequencies. All are not
significant at the ten percent level of significance.

Table 2 reports the Bayesian passing scores and the
corresponding empirical Bayes passing scores (the last entry in each
cell) for several combinations of $\theta_0$, $Q$, and $t$. The data
indicate that for the situations under consideration, the Bayesian and
empirical Bayes passing scores are identical, or nearly so, as long as
the test score distribution is reasonably symmetrical (Cases A2, A4,
and A5). For highly skewed distributions (Cases A1 and A3) the two
passing scores rarely differ by more than one unit when the criterion
level $\theta_0$ is relatively high (.70 or .80) and when $\lambda/\nu$ is
such that $t$ is not too close to $n$, say when $\lambda/\nu$ is at least
.03. Large discrepancies, however, may occur at a low criterion level
such as .60 or when $t$ is close to $n$.
                                    TABLE 2

             Bayesian and Empirical Bayes Passing Scores for Five
              Approximate Beta-Binomial Test Score Distributions

              Bayesian (at $\lambda/\nu$ = .02, .03, .04, .05) and empirical Bayes
              (italic in the original; here the last entry, after the bar) at
Data
Set  $\theta_0$    Q = .25              Q = .50              Q = 1.00             Q = 2.00

A1   .60    4, 5, 6, 6 |   4    3, 4, 5, 5 |   2    2, 3, 4, 4 |   1    1, 2, 3, 3 |   0
     .70    7, 8, 8, 8 |   6    6, 7, 7, 7 |   5    5, 5, 6, 6 |   4    4, 4, 5, 5 |   3
     .80   10,10,10,10 |   9    9, 9, 9, 9 |   9    8, 8, 8, 8 |   7    7, 7, 7, 7 |   6

A2   .60    7, 8, 8, 8 |   7    6, 7, 7, 7 |   6    5, 6, 6, 6 |   5    4, 4, 5, 5 |   4
     .70   10,10, 9, 9 |   9    9, 9, 9, 9 |   9    8, 8, 8, 8 |   8    7, 7, 7, 7 |   7
     .80   10,10,10,10 |  10   10,10,10,10 |  10   10,10,10,10 |  10    9, 9, 9, 9 |   9

A3   .60    1, 3, 4, 4 |   3    1, 2, 3, 3 |   2    0, 1, 2, 2 |   2    0, 1, 1, 2 |   0
     .70    4, 5, 6, 6 |   6    3, 4, 5, 5 |   5    2, 3, 4, 4 |   4    1, 2, 3, 3 |   3
     .80    8, 8, 9, 9 |   8    7, 7, 8, 8 |   7    5, 6, 7, 7 |   6    4, 5, 6, 6 |   5

A4   .60    9, 9, 9, 9 |   9    9, 8, 8, 8 |   8    8, 7, 7, 7 |   8    7, 6, 6, 6 | ...
     .70   10,10,10,10 |  10   10,10,10,10 |  10   10, 9, 9, 9 |  10    9, 9, 8, 8 | ...
     .80   10,10,10,10 |  10   10,10,10,10 |  10   10,10,10,10 |  10   10,10,10,10 |  10

A5   .60   10,10, 9, 9 |  10    9, 9, 9, 9 |   9    8, 8, 8, 8 |   8    7, 7, 7, 7 |   7
     .70   10,10,10,10 |  10   10,10,10,10 |  10   10,10, 9, 9 |  10    9, 9, 9, 9 |   9
     .80   10,10,10,10 |  10   10,10,10,10 |  10   10,10,10,10 |  10   10,10,10,10 |  10


4. A COMPARISON OF BAYESIAN AND EMPIRICAL
BAYES PASSING SCORES FOR CTBS TEST DATA

This phase of the study is based on a 10% systematic sample
of the entire third grade CTBS-Level C data file compiled during the
1978 South Carolina Statewide Testing Program. To obtain the
frequency distributions labeled as B1 to B10 (in Tables 1 and 3), the
following procedure was used. First, ten 10-item subtests were
assembled by random selection of items from each CTBS subtest.
Next, for each 10-item subtest, a frequency distribution was
constructed for each school district which had at least 20 students in
the systematic sample, and the corresponding $s_g^2$ value was obtained.
(The $s_g^2$ values were distributed as follows: .10 to .50 (32%), .51
to .75 (38%), .76 to 1.00 (20%), and more than 1.00 (10%). Large
$s_g^2$ values tended to associate with subtests dealing with reading
comprehension (sentences or paragraphs), language expression, and
language mechanics.) Third, among the frequency distributions with
$s_g^2$ values included between .01 and .05, ten were finally selected
and altered slightly so that the total number of examinees (m) was
exactly 20, 40, 60, or 80.

Table 3 lists the Bayesian and empirical Bayes passing scores
under a variety of conditions. As in the previous section, the data
show that the two sets of passing scores are the same, or nearly
so, as long as the test score distribution is reasonably symmetric
(see Cases B4, B5, and B7). Discrepancies in these situations are
rarely larger than one unit. For most other situations, the
difference between the two values for a passing score is seldom larger
than one unit, when the criterion $\theta_0$ is .70 or .80 and when
$\lambda/\nu$ is at least .03. The same magnitude of difference, one unit,
also tends to hold at $\theta_0$ = .60 unless the test scores pile up at
extreme values (Case B6) or unless the frequencies are fairly irregular
(Case B1).

                                    TABLE 3

                 Bayesian and Empirical Bayes Passing Scores
                    for Ten CTBS Test Score Distributions

              Bayesian (at $\lambda/\nu$ = .02, .03, .04, .05) and empirical Bayes
              (italic in the original; here the last entry, after the bar) at
Data
Set  $\theta_0$    Q = .25              Q = .50              Q = 1.00             Q = 2.00

B1   .60    5, 5, 6, 6 |   3    4, 4, 5, 5 |   2    3, 3, 4, 4 |   1    2, 2, 3, 3 | ...
     .70    7, 7, 8, 8 |   6    6, 6, 7, 7 |   5    5, 5, 6, 6 | ...    4, 4, 5, 5 |   3
     .80   10,10,10,10 |   9    9, 9, 9, 9 |   8    8, 8, 8, 8 |   7    7, 7, 7, 7 |   6

B2   .60    6, 6, 6, 6 |   5    5, 5, 5, 5 |   4    4, 4, 4, 5 |   2    3, 3, 3, 4 |   1
     .70    8, 8, 8, 8 |   7    7, 7, 7, 7 |   6    6, 6, 6, 6 |   5   ..., 5, 5, 6 |   4
     .80   10,10,10,10 |   9    9, 9, 9, 9 |   9    8, 8, 8, 8 |   8    7, 7, 8, 8 |   7

B3   .60    6, 6, 7, 7 | ...    5, 5, 6, 6 |   6    4, 4, 5, 5 |   5    3, 4, 4, 4 |   4
     .70    8, 8, 8, 8 | ...    7, 7, 8, 8 |   7    6, 7, 7, 7 |   6    5, 6, 6, 6 |   6
     .80   10,10,10,10 |  10    9, 9, 9, 9 |   9    9, 9, 9, 9 |   8    8, 8, 8, 8 | ...

B4   .60    9, 9, 9, 9 |   9    9, 8, 8, 8 | ...    8, 8, 7, 7 |   7    7, 7, 6, 6 |   7
     .70   10,10,10,10 | ...   10,10,10,10 |  10   10, 9, 9, 9 |   9    9, 9, 8, 8 |   9
     .80   10,10,10,10 | ...
...

5. ADDITIONAL DATA FOR MODERATELY
SKEWED DISTRIBUTIONS

Additional comparisons were made for ten 20-item tests with
distributions having skewness ranging from -1.109 to .117 (see
Table 4). These tests were assembled in the same way as the
10-item tests described in Section 4. As in the previous sections,
the criterion level $\theta_0$ was set at .60, .70, and .80, and the loss
ratio $Q$ at .25, .50, 1.00, and 2.00. The prior knowledge about the
examinees was assumed to be equivalent to a number of items, $t$, of
8.875 ($\lambda/\nu$ = .02), 5.75 ($\lambda/\nu$ = .03), 4.1875 ($\lambda/\nu$ = .04),
and 3.25 ($\lambda/\nu$ = .05). For all the 480 combinations under
consideration, the absolute values of the discrepancies between the two
computed passing scores are distributed as follows: 0 (35%), 1 (37%),
2 (15%), 3 (5%), and 4 or more (8%). Hence in about three-fourths of
all situations, the Bayesian and empirical Bayes passing scores do
not differ from each other by more than one unit.

                                TABLE 4

        Frequency Distribution of Scores on Ten CTBS Subtests
                        Mentioned in Section 5

6. AGREEMENT OF MASTERY/NONMASTERY DECISIONS

As noted in Section 4, there are situations (such as some
cases associated with the A1, B1, and B6 data sets) where the
passing scores obtained from the two methods differ appreciably. This
may seem disheartening. However, the procedures provide mastery/
nonmastery classifications which are in high agreement for most
cases under consideration. For Data Set A1 with $\theta_0$ = .60 and .70,
for example, the combined proportions of students identically
classified in either the mastery or nonmastery category by the Bayesian
procedure (with $\lambda/\nu$ = .05) and by the empirical Bayes procedure are
88%, 95%, 99%, and 100% for $Q$ = .25, .50, 1.00, and 2.00
respectively. Over the fifteen data sets of Table 1 and with the same
values for $\lambda/\nu$ and $Q$, the proportions of identical classifications
reach 94%, 96%, 98%, and 97% respectively. As for the data of
Table 4, these proportions stand at 98%, 98%, 98%, and 97%.

Though the overall agreement for classifications is high for
the data considered in this study, some individual cases may show
less agreement than others. These cases include situations such as
A2 with $\theta_0$ = .60, $Q$ = .25, and $\lambda/\nu$ = .05 where the Bayesian
passing score of 8 and the empirical Bayes passing score of 7 are
located near the center of the test score distribution. The shift of
only one unit in test score in this case actually causes 16 students
out of a total of 80 to be classified differently by the two procedures.
Visible disagreement between the classifications defined by the
Bayesian and empirical Bayes procedures may occur in situations
where scores with high frequencies of occurrence are selected as
the passing scores. If this is the case, the proportion of
students classified in the mastery (or nonmastery) category is not
likely to be close to either 0% or 100%. In other situations where
most students are declared masters (Data Set A1 with $\theta_0$ = ...,
$\lambda/\nu$ = .05, and $Q$ = 2.00) or nonmasters (Data Set A5 with
$\theta_0$ = .70, $\lambda/\nu$ = .05, and $Q$ = 1.00), the agreement in
classifications is almost perfect.
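
The agreement figures quoted for Data Sets A1 and A2 may be checked with a short Python sketch. The frequencies and passing scores below are read from Tables 1 and 2; the placement of each frequency at a particular score is the sketch's reading of Table 1, and the function name `pooled_agreement` is hypothetical. Two cutoffs classify an examinee differently exactly when the score lies at or above the lower cutoff and below the higher one.

```python
def pooled_agreement(cases):
    """Proportion of examinees classified identically by two cutoffs.

    cases is a list of (freqs, cut_a, cut_b) triples, where freqs[x] is the
    frequency at score x = 0..n. Several triples are pooled by counting
    examinees over all of them.
    """
    total = differ = 0
    for freqs, cut_a, cut_b in cases:
        low, high = sorted((cut_a, cut_b))
        differ += sum(freqs[low:high])   # scores in [low, high) disagree
        total += sum(freqs)
    return (total - differ) / total


# Frequencies at scores 0..10, as read from Table 1 (assumed placement).
A1 = [0, 0, 0, 0, 0, 1, 3, 6, 8, 11, 11]
A2 = [0, 0, 1, 3, 6, 10, 13, 16, 15, 11, 5]

# A2, theta0 = .60, Q = .25: Bayesian cutoff 8 versus empirical Bayes 7.
a2_agreement = pooled_agreement([(A2, 8, 7)])

# A1, lambda/nu = .05, pooled over theta0 = .60 and .70, for each Q:
# (Bayesian, empirical Bayes) cutoffs taken from Table 2.
a1_cutoffs = {
    0.25: [(6, 4), (8, 6)],
    0.50: [(5, 2), (7, 5)],
    1.00: [(4, 1), (6, 4)],
    2.00: [(3, 0), (5, 3)],
}
a1_agreement = {
    q: pooled_agreement([(A1, b, e) for b, e in pairs])
    for q, pairs in a1_cutoffs.items()
}
```

Under this reading, 16 of the 80 examinees in A2 are classified differently, and the pooled proportions for A1 round to the 88%, 95%, 99%, and 100% reported in the text.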

7. DISCUSSION AND CONCLUSION

The results described in previous sections may be summarized
as follows: (i) Bayesian passing scores and those computed via the
empirical Bayes procedure are identical or almost identical as long
as the test score frequency distribution is reasonably symmetric or
when the criterion level $\theta_0$ is sufficiently high (.70 or .80);
(ii) large discrepancies may occur at criterion
levels of .60 (or below), especially when the test scores pile up
at a few extreme values or when the frequency distribution is
irregular; (iii) however, mastery/nonmastery decisions derived from
the two procedures are most often identical. Overall, the combined
proportion of students similarly classified by both procedures is
about 97%.

All in all, there is little difference between the Bayesian
approach as described by Swaminathan et al. and the Huynh empirical
Bayes procedure described here, either in terms of the resulting
passing scores or in terms of the mastery/nonmastery categorization.

It should be pointed out that the procedure by Swaminathan et
al. relies on a normal arcsine-square-root transformation of the
test data and is therefore considered adequate only when the test
has at least 8 items. In addition, the scheme requires the
evaluation of certain posterior probabilities. This may be done via the
MARPRO computer program (mentioned in Wang, 1973) or via the Wang
tables. To the chagrin of the writers, many frequency
distributions such as those derived from the CTBS test data of the South
Carolina Statewide Testing Program have $s_g^2$ values much larger than
the upper bound of .05 allowed in the above-mentioned tables. In
addition, the constraint of having at least 8 items seems to be
quite severe in many practical situations involving
objective-referenced testing. Such tests frequently have 5 or fewer items
per objective.

The empirical Bayes approach in its simplest form, as
presented in Huynh (1976a), requires that the test scores follow a
beta-binomial distribution. There are indications (Keats & Lord,
1962; Duncan, 1974; Huynh & Saunders, 1979; and see Table 1) that
this model adequately fits many test score distributions. Moreover,
it is known (Subkoviak, 19..; Huynh & Saunders, 1979) that the
model is useful in the estimation of the reliability of mastery
classification based on one test administration. In addition,
using the empirical Bayes approach, passing scores can be computed
for tests of any length and can be approximated quickly via
Equation (2).

It may be noted that the Bayesian and empirical Bayes
procedures discussed in this paper deal with the setting of passing
scores for a particular test. Both procedures assume the
availability of a minimum mastery or criterion level $\theta_0$ and the
availability of other information such as $Q$, the ratio of the loss
incurred by a false negative decision to that incurred by a false
positive one, as defined in Section 2. In the context of testing for
instructional purposes, $\theta_0$ may be based on the judgment of a
curriculum specialist or a knowledgeable teacher and $Q$ may be
assessed via the time losses encountered by a misdecision (Huynh,
1976a). The issue is much more involved for end-of-program
certification, such as high school graduation (minimum competency)
testing programs legislated in several states. The reader is
referred to Jaeger (1976) and Shepard (1976) for insight
regarding some of these issues.

The empirical Bayes approach with the availability of a
predetermined criterion level, however, is only the simplest form of
the general framework of mastery evaluation as approached by Huynh
(1976a). The essential component of this model is an external task
(real or hypothetical) that examinees are supposed to perform once
they are granted mastery of the objectives or content upon which a
test is based. Such an external task may be identified in the
context of instruction, especially when instructional units are
sequenced in some logical order. If this requirement is fulfilled,
the specification of $\theta_0$ is no longer necessary. Some suggestions
for solutions along this line have been presented elsewhere (Huynh,
1976a, p. 73-75; Huynh, 1977; Huynh & Perney, 1979). To the
knowledge of the writers, the Bayesian approach as presented by
Swaminathan et al. has not been generalized to situations other
than those involving constant losses and when a criterion level is
available. Although such a generalization may be made, the
numerical analysis would be more involved than can be expected from the
empirical Bayes approach.

The procedures studied in this paper are based on group data and
therefore are appropriate to the extent that minimization of loss is
considered for the entire group of examinees. This may be the case
for minimum competency testing where resources for remedial
instruction are limited. Procedures relating to standard setting in
the absence of group data are available (see, for example, Huynh, 1978).

In conclusion, the empirical Bayes approach yields mastery/
nonmastery decisions identical in most cases to those based on the
Bayesian approach. In addition, the former approach is simpler in
terms of computations, is applicable to any test length, and has
been generalized to more complex testing situations.